Identifying Meaningful Citations

Authors

  • Marco Valenzuela
  • Vu Ha
  • Oren Etzioni
Abstract

We introduce the novel task of identifying important citations in scholarly literature, i.e., citations that indicate that the cited work is used or extended in the new effort. We believe this task is a crucial component in algorithms that detect and follow research topics and in methods that measure the quality of publications. We model this task as a supervised classification problem at two levels of detail: a coarse one with two classes (important vs. non-important), and a more detailed one with four importance classes. We annotate a dataset of approximately 450 citations with this information and release it publicly. We propose a supervised classification approach that addresses this task with a battery of features that range from citation counts to where the citation appears in the body of the paper, and show that our approach achieves a precision of 65% at a recall of 90%.
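As a rough illustration of how a figure such as "precision at a given recall" can be read off a scored classifier, the sketch below ranks citations by a classifier score and reports the precision at the highest threshold whose recall reaches the target. The function name `precision_at_recall`, the scores, and the labels are hypothetical and chosen only for illustration; they are not the authors' classifier, features, or data.

```python
from typing import List, Tuple


def precision_at_recall(
    scores: List[float], labels: List[int], target_recall: float
) -> Tuple[float, float, float]:
    """Return (threshold, precision, recall) at the highest score
    threshold whose recall is at least target_recall.

    labels use 1 for "important" and 0 for "non-important".
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank items from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    true_pos = 0
    for i, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        # Items with tied scores share a threshold, so evaluate
        # only after the last item of each tie group.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        recall = true_pos / total_pos
        if recall >= target_recall:
            return score, true_pos / i, recall

    raise ValueError("target_recall must be at most 1.0")


# Hypothetical scores for six citations, three of them important.
demo_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
demo_labels = [1, 0, 1, 1, 0, 0]
threshold, precision, recall = precision_at_recall(demo_scores, demo_labels, 0.9)
print(threshold, precision, recall)
```

In this toy data, reaching a recall of at least 0.9 requires accepting the top four citations, three of which are important, so lowering the threshold to gain recall costs some precision.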


Similar resources

Applying MetaMap to Medline for identifying novel associations in a large clinical dataset: a feasibility analysis

OBJECTIVE: We describe experiments designed to determine the feasibility of distinguishing known from novel associations based on a clinical dataset comprised of International Classification of Disease, V.9 (ICD-9) codes from 1.6 million patients by comparing them to associations of ICD-9 codes derived from 20.5 million Medline citations processed using MetaMap. Associations appearing only in th...


Corpus and Method for Identifying Citations in Non-Academic Text

We attempt to identify citations in non-academic text such as patents. Unlike academic articles which often provide bibliographies and follow consistent citation styles, non-academic text cites scientific research in a more ad-hoc manner. We manually annotate citations in 50 patents, train a CRF classifier to find new citations, and apply a reranker to incorporate non-local information. Our bes...


Towards understanding the de-adoption of low-value clinical practices: a scoping review

BACKGROUND: Low-value clinical practices are common in healthcare, yet the optimal approach to de-adopting these practices is unknown. The objective of this study was to systematically review the literature on de-adoption, document current terminology and frameworks, map the literature to a proposed framework, identify gaps in our understanding of de-adoption, and identify opportunities for addi...


The Wisdom of Citing Scientists

This Brief Communication discusses the benefits of citation analysis in research evaluation based on Galton's "Wisdom of Crowds" (1907). Citations are based on the assessment of many which is why they can be ascribed a certain amount of accuracy. However, we show that citations are incomplete assessments and that one cannot assume that a high number of citations correlate with a high level of u...


Effectiveness of different databases in identifying studies for systematic reviews: experience from the WHO systematic review of maternal morbidity and mortality

BACKGROUND: Failure to be comprehensive can distort the results of a systematic review. Conversely, extensive searches may yield an unmanageable number of citations, of which only a few may be relevant. Knowledge of the usefulness of each source of information may help to tailor search strategies in systematic reviews. METHODS: We conducted a systematic review of prevalence/incidence of maternal mortalit...



Publication date: 2015